Historical fragmentation and stepping‐stone gene flow led to population genetic differentiation in a coastal seabird

Abstract Understanding the forces that shape population genetic structure is fundamental both for understanding evolutionary trajectories and for conservation. Many factors can influence the geographic distribution of genetic variation, and the extent to which local populations differ can be especially difficult to predict in highly mobile organisms. For example, many species of seabirds are essentially panmictic, but some show strong structure. Pigeon Guillemots (Cepphus columba; Charadriiformes: Alcidae) breed in small colonies scattered along the North Pacific coastline and feed in shallow nearshore waters year‐round. Given their distribution, gene flow is potentially lower and population genetic structure is stronger than in most other high‐latitude Northern Hemisphere seabirds. We screened variation in the mitochondrial control region, four microsatellite loci, and two nuclear introns in 202 Pigeon Guillemots representing three of five subspecies. Mitochondrial sequences and nuclear loci both showed significant population differences, although structure was weaker for the nuclear loci. Genetic differentiation was correlated with geographic distance between sampling locations for both the mitochondrial and nuclear loci. Mitochondrial gene trees and demographic modeling both provided strong evidence for two refugial populations during the Pleistocene glaciations: one in the Aleutian Islands and one farther east and south. We conclude that historical fragmentation combined with a stepping‐stone model of gene flow led to the relatively strong population differentiation in Pigeon Guillemots compared to other high‐latitude Northern Hemisphere seabird species. Our study adds to growing evidence that Pleistocene glaciation events affected population genetic structure not only in terrestrial species but also in coastal marine animals.


| INTRODUC TI ON
Understanding the extent to which local populations of a species differ genetically is fundamental both to understanding of evolution and ecology and to conservation.Population genetic structure is influenced by many factors, such as dispersal patterns and population history, so can be difficult to predict in species that have not been studied directly.For example, most seabirds have strong dispersal abilities, so gene flow is generally high and population genetic structure correspondingly weak (e.g., Little Auk Alle alle, Wojczulanis-Jakubas et al., 2014;reviewed in Friesen, 2015;Friesen et al., 2007;Lombal et al., 2020).However, strong population genetic structure exists in some seabird species, suggesting that gene flow is somehow restricted (e.g., Whiskered Auklet Aethia pygmaea, Pshenichnikova et al., 2015).In some species, genetic differentiation is explained by demographic history; for example, population fragmentation by Pleistocene glaciers has led to differentiation through genetic drift in many species (Lombal et al., 2020).In other cases, differences in nonbreeding distributions may promote population differentiation; for example, seabirds that reside at colonies year-round or that migrate to population-specific wintering areas may have less opportunity for gene flow than do those with a single, species-specific migratory destination (e.g., Thick-billed Murre Uria lomvia, Tigano et al., 2015, but see Quillfeldt et al., 2017).Similarly, species that forage inshore have less opportunity for gene flow among colonies than do more pelagic feeders, and accordingly, inshore foraging is sometimes associated with population genetic structure (Wiley et al., 2012).Importantly, gene flow and population genetic structure may also be influenced by the geographic distribution of colonies: species that nest in small colonies scattered along coastlines likely follow a one-dimensional stepping-stone model of gene flow where breeding recruits disperse only short distances from their natal colony.Under this model, gene flow is less effective at countering genetic drift than in an n-island model (Kimura & Weiss, 1964).
Guillemots (Cepphus spp.; Charadriiformes: Alcidae) differ from most seabirds in several key traits, and so provide useful systems for investigating mechanisms of population differentiation in highly mobile species.Although guillemots have the potential to disperse hundreds of kilometers from natal sites (e.g., Johnston et al., 2018), several aspects of their natural history may restrict gene flow: guillemots avoid flying over large expanses of land or ice (Udvardy, 1963); they generally breed in small colonies on islands and headlands (Gaston & Jones, 1998); they forage on small demersal fish in shallow water near breeding colonies (e.g., Golet et al., 2000); and they winter close to their breeding colonies (Ewins, 2020).Furthermore, given that guillemots breed in small colonies scattered along coastlines and feed nearshore, often near sea ice (Divoky et al., 2016), they could have survived the Pleistocene glaciations in small coastal refugia (Roberts & Hamann, 2015;Waltari et al., 2007).Accordingly, Kidd and Friesen (1998a) found that mitochondrial DNA (mtDNA) sequences vary among regional populations of Black Guillemots (C.grylle) in the North Atlantic, due at least partly to historical fragmentation by Pleistocene glaciers.
Pigeon Guillemots (C.columba) breed along coastlands of the North Pacific Ocean (Figure 1), with an estimated total population of less than 69,000 breeding birds (Kushlan et al., 2002).The possibility that local populations of Pigeon Guillemots may differ genetically is suggested by geographic variation in morphology: five subspecies have been designated based on clines in wing, tarsus, and culmen lengths (increasing from north to south), and quantity of white on the wing (increasing from south to north; Storer, 1952).
Furthermore, several co-distributed and ecologically similar seabird species have strong population genetic structures (e.g., Marbled Murrelets Brachyramphus marmoratus, Congdon et al., 2000;Kittlitz's Murrelets B. brevirostris, Birt et al., 2011).Kidd and Friesen (1998a) analyzed geographic variation in the conserved central domain of the mitochondrial control region (mCR) of Pigeon Guillemots and found evidence of population genetic structure.However, their geographic sampling was limited and their study included only mtDNA, which often differs in geographic variation from nuclear DNA due to differences in mode of inheritance and mutation rates (e.g., Ando et al., 2014;Eda et al., 2008;Walsh & Edwards, 2005).
In the present study, we analyzed variation in the more variable sections of the mCR and six noncoding nuclear loci among Pigeon Guillemots sampled throughout the northeastern Pacific to characterize their population genetic structure and historical demography.
Given their natural history (above), we hypothesized that local populations would differ at neutral molecular loci and that genetic differences between local populations would correlate with geographic distance.Furthermore, we hypothesized that geographic variation in DNA sequences would reflect historical fragmentation driven by Pleistocene glaciation events.

| Sampling and DNA screening
Solid tissue or blood samples were collected from 202 Pigeon Guillemots from 22 breeding sites within 11 locations (archipelagos, inlets, and bays) ranging from the western Aleutian Islands to central California (Table 1, Figure 1).Most samples from Alaska comprised solid tissue (heart, liver, or striated muscle) from adults in breeding condition collected near colonies for dietary analyses (e.g., Hobson et al., 1994).Samples from elsewhere generally consisted of blood from adults caught at nests, opportunistically collected dead chicks and eggs, and growing coverts ("blood" or "pin" feathers) plucked from chicks (one per nest).Samples are archived at Queen's University, the Royal Ontario Museum, the Burke Museum, the American Museum of Natural History, and the University of Alaska Museum at Fairbanks.DNA was extracted using a standard protease K, RNase, phenol/chloroform technique followed by ethanol precipitation (Sambrook et al., 1989).Guillemot-specific DNA primers (CGL56 and CGH549, and CGL486 and CGH1006) were used to amplify and sequence two overlapping fragments of the mCR, including parts of Domains I, II, and III, following protocols detailed in Kidd and Friesen (1998b).
Sequences were obtained for 723 base pairs (bp) of the mCR, excluding the first 141 bp downstream (3′) of primer CgL56 and a 92 bp region between primers CgL486 and CgH589, both of which were ambiguous due to highly repetitive simple sequence motifs.
Sequences were aligned using the program BioEdit (Hall, 1999), and haplotypes were identified using the program TCS (v 1.13, Clement et al., 2000).
The presence of length variation was determined for four microsatellite loci (Appendix 1; Table A1) either by electrophoresis through polyacrylamide gels (Ibarguchi et al., 2000) or using the GenomeLab GeXP Genetic Analysis System (Beckman Coulter Inc., USA) and associated software (10.2.3).Multiple samples were analyzed using both systems to test both for repeatability and for scoring differences due to platform effects; none were found.
Variation in six nuclear introns (Appendix 1; Table A1) was screened using a combination of single-stranded conformational polymorphisms (SSCPs) and direct sequencing as in Friesen et al. (1997).Individuals from southeast Alaska were sequenced directly using a 3730xl DNA Analyzer Platform (Applied Biosystems, CA) operated by Genome Quebec (McGill University, Montreal, Quebec).Sequences were trimmed and aligned using the program Geneious (Kearse et al., 2012), variable sites were scored from chromatograms, and haplotypes were phased using the program PHASE (v.2.1; Stephens & Donnelly, 2003;Stephens et al., 2001).
ARLEQUIN was also used to test mitochondrial variation for deviations from neutrality using Ewens-Watterson and Chakraborty tests (Chakraborty, 1990;Ewens, 1972;Watterson, 1978) and to test nuclear loci for deviations from Hardy-Weinberg proportions and linkage equilibrium.

| Population genetic structure
The proportion of genetic variation distributed among populations (Φ ST for mCR sequences; F ST for nuclear loci) was calculated F I G U R E 1 Breeding distribution (colored lines) of Pigeon Guillemots, and breeding sites where samples were collected (circles).Sampling site codes are given in Table 1.Redrawn from Udvardy (1963).
by analysis of molecular variance (AMOVA) in ARLEQUIN, both for the entire dataset and for pairwise comparisons of sampling locations.Statistical significance was assessed by randomization using 10,000 permutations of the data with a rejection level (α) of .05.Kimura's 2-parameter model of sequence evolution (Kimura, 1980) with a gamma value of .42 was used for mCR sequences (Marshall & Baker, 1997); use of the Tamura-Nei model (Tamura & Nei, 1993)  A principal component analysis (PCA) was performed on the nuclear data in R (3.3.1,R Core Team, 2017) using the packages adegenet (Jombart, 2008) and ade4 (Dray & Dufour, 2007).PCAs were run with samples grouped either by sampling location or by subspecies.
The program STRUCTURE 2.3.4 (Pritchard et al., 2000(Pritchard et al., , 2010) ) was used as an additional test of population structure in nuclear variation.The program was initially run under the admixture model with correlated allele frequencies and without sampling location as prior information, with a burn-in of 10,000 iterations and 100,000 iterations after the burn-in.Following the recommendation of Hubisz et al. (2009) for species with weak structure, the model was subsequently run without admixture, with sampling location as prior information.Each value of K (number of genetic populations) from 1 to 5 was run 10 times, and delta K (ΔK, the most probable number of genetic populations) was calculated using STRUCTURE HARVESTER (Earl & vonHoldt, 2012;Evanno et al., 2005).Because ΔK cannot test if the most likely value of K is 1, the log-likelihood of each K value (ln Pr(x|K)) also was calculated using the equation provided in the STRUCTURE manual (Pritchard et al., 2010).Membership probabilities were averaged among runs using CLUMPP (v.1.1.2, Jakobsson & Rosenberg, 2007) and results were displayed using DISTRUCT (v.1.1, Rosenberg, 2004).
Because the program STRUCTURE may detect artificial genetic clusters under isolation by distance (Perez et al., 2018), Mantel tests (Mantel, 1967) were used to test for correlations between pairwise estimates of Slatkin's linearized Φ ST (mCR) or F ST (mCR and nuclear loci) with log-transformed geographic distance between sampling locations using the R package ecodist (Goslee & Urban, 2007) with 10,000 permutations of the data.The shortest distance between sampling locations was calculated using the package 'geosphere' (Hijmans et al., 2017).Where samples were collected from multiple breeding sites within a location (e.g., for comparisons within Andreanof Islands, Table 1), the geographic midpoint of sampling sites was used.
TA B L E 1 Subspecies, sampling locations, location abbreviations, sampling sites, site abbreviations, and numbers (n) of Pigeon Guillemots analyzed for variation in mtDNA and nuclear DNA.See Figure 1 for locations of sampling sites.The program BEAST 1.8.4 was also used to construct a gene tree for the mCR sequences (Drummond et al., 2012).Individuals were grouped by sampling location (  & Friesen, 1998b).The Hasegawa-Kishino-Yano substitution model (HKY) with a proportion of invariable nucleotide sites (I = 0.796) and gamma distribution (G = 0.733; Hasegawa et al., 1985) was selected as the best model using the Bayesian information criterion (BIC) in jModelTest2 (Darriba et al., 2012).A strict molecular clock with a mutation rate of 7.4%/my (million years; Wenink et al., 1996) and the coalescent tree prior that assumes a constant population size with Jeffrey's prior (Drummond et al., 2005;Jeffreys, 1946;Kingman, 1982) were selected.The BEAST analysis was run for 10 7 MCMC (Markov chain Monte Carlo) runs with a burn-in of 10 6 runs.
TRACER 1.6 (Rambaut et al., 2014) was used to analyze the trace files generated by BEAST and to assess convergence.The analysis was run 3 times and output files were combined using LogCombiner (v.1.8.4), with a burn-in of 10% of trees for each original file.A consensus tree was generated in TreeAnnotator (v. 1.8.4, Drummond et al., 2012) and the tree was displayed graphically using FigTree (v. 1 .4.3, Rambaut, 2007).

| Population simulations
To test alternative models of population history, mCR variation was compared to simulated variation generated under three potential scenarios using coalescent-based approximate Bayesian computation (ABC) in DIYABC (Cornuet et al., 2008).This approach involves modeling different historical scenarios and simulating population sequence evolution to fit each scenario.The simulated datasets are then compared to indices of variation in the observed dataset to determine which scenario best fits the data.Based on the observed geographic distribution of mCR variation (see Section 3), samples were of Pigeon Guillemots was estimated to be 8.8 years (Hudson, 1985).
Times of divergence of the ancestral populations (t2 and tc) were set to 1250-100,000 generations ago (11,000-880,000 years ago [ya]; i.e., before the end of the Wisconsin glaciation).Admixture events and more recent divergences (t1, ta, and tb) were set to 0-1250 generations ago (present day -11,000 ya; i.e., after the end of the Wisconsin glaciation).Rate of admixture (ra) was left as the default (0.001-0.999), and the HKY + I + G substitution model was selected, as above.Mean mutation rate was set to 6.5 × 10 −7 per site per generation based on a sequence divergence rate of 7.4%/my and a generation time of 8.8 years (above).Eight single-sample summary statistics were used for each population: number of distinct haplotypes, number of segregating sites, mean pairwise difference, variance of the number of pairwise differences, Tajima's D statistics (Tajima, 1989), number of private segregating sites, mean of the numbers of the rarest nucleotide at segregating sites, and variance of the numbers of the rarest nucleotide at segregating sites.Additionally, 5 two-sample summary statistics were used: number of distinct haplotypes in the pooled sample, number of segregating sites in the pooled sample, mean of within-sample pairwise differences, mean of between sample pairwise differences, and pairwise F ST (Hudson et al., 1992).Three million simulations were run for each scenario, and posterior probabilities were compared using logistic regression (Cornuet et al., 2010;Inoue et al., 2014).Type I and Type II error rates were estimated using the method described by Cornuet et al. (2010) and Inoue et al. (2014).

| Molecular variation and tests of assumptions
Mitochondrial control region sequences were similar to those published previously for guillemots (Kidd & Friesen, 1998a, 1998b), and expected, p = .02;Table 2); however, these samples did not show a significant deviation from neutrality using the Ewens-Watterson test.
All nuclear loci were variable.No null alleles were detected by MICROCHECKER for microsatellite data.None of the nuclear loci showed significant deviations from Hardy-Weinberg proportions within any of the sampling locations except for Ribosomal Protein 40 (RP4 intron V) in the Strait of Georgia (Table 3).Six of the 11 sampling locations showed evidence of linkage disequilibrium between different loci (Table A2).Tests for population genetic structure were rerun excluding one locus from each pair and results remained consistent; all loci were therefore retained for subsequent analyses.
Six alleles, defined by variation at four (Cytochrome C intron I) and 15 nucleotide sites (RP40) were found within each of the two introns.
Alleles differed by one to 12 substitutions.Number of alleles at microsatellite loci ranged from 3 to 11.For all six nuclear loci, several alleles were present at high frequency in most sampling locations, and the remaining alleles occurred in only one or two individuals each.

| Population genetic structure
Of the 81 mCR haplotypes, 64 were private (found in only one sampling location), including five at high frequencies (over 25%).With  4).
Estimates of global population genetic structure for nuclear loci were statistically significant but slightly lower than for the mCR (F ST = 0.09 for all nuclear loci; F ST = 0.11 for microsatellites only; F ST = 0.05 for introns only; all p < .001).Pairwise estimates of F ST were statistically significant for most location comparisons for all nuclear loci combined and for microsatellites and for some pairwise comparisons for introns (Table 4; Table A3).
No population genetic structure was obvious in the PCA when Pigeon Guillemots were grouped by sampling location (Figure 3a).
When populations were grouped by subspecies, C. kaiurka and C.
eureka were largely separated from each other, whereas C. adianata overlapped with both other subspecies (Figure 3b), with the first three principal components explaining 17.5% of the variance in both analyses.
Estimates of both ΔK and Ln Pr(x|K) from analyses of nuclear variation using STRUCTURE indicated that the most likely value of K for Pigeon Guillemots is 2 (Figure 4; Table A4).Genetic groups were not completely segregated geographically, with most individuals showing some probability of assignment to both genetic populations.
Mantel tests for correlations between genetic and geographic distance were statistically significant both for mCR variation (r = .56,p < .05)and for all nuclear loci (r = .59,p < .05; Figure 5).
Given the divergence between mtDNA haplotypes of samples from the Aleutian Islands versus elsewhere, Mantel tests were repeated after removing samples from the Aleutian Islands (Andreanof Island and Fox Island), as most of these samples were distinct from those from other locations (Figure 6, Figure A1): correlations remained significant but slightly weaker (mCR, r = .52,p < .05;all nuclear loci, r = .52,p < .05).The Mantel test for mtDNA variation was also rerun based on haplotype frequencies only (i.e., F ST rather than ϕ ST ): the result was no longer significant (r = .14,p = .18;plot not shown), suggesting that the correlation is driven at least partially by divergence between Aleutian and other mtDNA lineages.

| Demographic history
Divergence times for the mtDNA sequences of the Black Guillemot outgroup versus Pacific species of guillemots were estimated from BEAST at between 0.63 and 1.2 mya (with a posterior  probability of 1), which is slightly more recent than the estimate of Kidd and Friesen (1998a; 1.5 mya; Figure A1).Spectacled and Pigeon Guillemot mtDNA lineages were estimated to have diverged between 0.43-0.87mya (with a posterior probability of 1), which is similar to the date estimated by Kidd and Friesen (1998a; 0.8 mya).
Both the TCS and BEAST gene trees separated mtDNA sequences of most Pigeon Guillemots from the Aleutian Islands (Andreanof and   1. Fox Islands) from those from other sampling locations, with two exceptions: sequence from one individual from Shumigan Island grouped with the Aleutian samples, and sequence from one individual from Fox Island mixed with samples from the other locations (Figure 6 and Figure A1); these haplotypes were confirmed both by analysis of SSCPs and by direct sequencing.Support from BEAST for monophyly of the mtDNA lineages of Pigeon Guillemots from the Aleutian Islands versus those from elsewhere was strong, with posterior probabilities of 1.The mtDNA lineages of Pigeon Guillemots in the Aleutian Islands versus elsewhere were estimated to have diverged between 0.16-0.32mya.Relationships among mtDNA lineages within these two clades were not well resolved.
Scenario 1 (isolation of guillemots in two refugial populations, followed by divergence within the southern population) was the most highly supported by DIYABC, with a strong posterior probability (0.83).The other two scenarios had much lower support.Type I and Type II error rates for all three scenarios were low (Table A5).

| DISCUSS ION
Our objectives were to characterize population structure in neutral molecular loci in a coastally distributed seabird and to test for historical fragmentation by Pleistocene glaciers.We found that mCR sequence variation was strongly structured geographically and that significant population structure also occurred in nuclear DNA despite the small number of loci.Genetic differences between sampling locations correlated with geographic distance, supporting a stepping-stone model of gene flow.Evidence was also found for population fragmentation by Pleistocene glaciation events.

| Population genetic structure
Estimates of ΦST for mCR sequences were significantly greater than zero between almost all sampling locations, with many private haplotypes and significant divergence between most haplotypes sampled from the Aleutian Islands versus elsewhere.Nuclear variation also was structured geographically: estimates of F ST between many sampling locations were statistically significant; PCA indicated some differentiation among subspecies; and analyses with STRUCTURE indicated two genetic populations.Although geographic structuring in nuclear loci was weaker than in mtDNA, this pattern is common in birds (e.g., Sonsthagen et al., 2011;Zink & Barrowclough, 2008) and consistent with the lower effective population size and correspondingly stronger genetic drift in mtDNA compared to nuclear DNA.Furthermore, a pattern of isolation by distance was suggested by several results: haplotype sharing between neighboring locations; correlations between genetic and geographic distance for both mCR and nuclear variation; and mixed probabilities of assignment of individuals to each of two genetic populations by STRUCTURE, with assignment probabilities changing roughly clinally.Additionally, the existence of four mitochondrial haplotypes shared between non-adjacent sampling locations is suggestive of occasional long-distance gene flow.

| Demographic history
In theory, the extent of population genetic differentiation is influenced by many factors, one of which is the pattern of gene flow (Wright, 1969).
As described in the Introduction, guillemots likely disperse according to a one-dimensional stepping-stone model of gene flow (along coastal habitat).If migrants follow an n-island model of gene flow (i.e., are exchanged at random among breeding locations), then in theory one migrant per generation will homogenize allele frequencies, and genetic differentiation between locations will be independent of geographic distance (Wright, 1969).However, if migrants disperse according to a stepping-stone model (i.e.primarily between neighboring breeding locations), then gene flow will be less effective at counteracting genetic drift (Kimura & Weiss, 1964), and genetic differentiation between locations should increase with distance (Hutchison & Templeton, 1999;Mantel, 1967).Genetic distance correlated with geographic distance in several seabirds reviewed by Friesen et al. (2007)  Meles meles, Pope et al., 2006).
Correlations between genetic and geographic distance can result from either recent differentiation in situ or historical fragmentation followed by population expansion and secondary contact (Hutchison & Templeton, 1999).In the present study, both of the mitochondrial gene trees showed a disjunction in sequence variation between most haplotypes from the Aleutian Islands and locations farther east and south.The distinctiveness of the Aleutian haplotypes suggests that guillemots from these islands were historically isolated from those elsewhere.Estimates of Φst and FST were also greatest between samples from the Aleutian Islands and those elsewhere, suggesting that they experienced limited gene flow in more recent years.The split between mtDNA lineages of the Aleutian Islands and other populations was estimated by BEAST to have occurred 0.16-0.36mya, before or during the Illinoian glaciation when ice sheets would have fragmented the Pacific coast of North America (Dyke, 2004).Results from DIYABC using mtDNA also suggested that, of the three demographic histories modeled, the most likely scenario was a historical split between populations in the Aleutian Island and those farther These findings are consistent with Udvardy's (1963) hypothesis that geographic variation in Pigeon Guillemots is due to isolation in multiple glacial refugia.
Pleistocene refugia are often discussed from the perspective of terrestrial species that cannot travel over large expanses of ice (e.g., Holder et al., 2000;Wang et al., 2021).Coastal species such as guillemots could potentially have persisted in small numbers in ice-free outcrops (e.g., Haida Gwaii).However, the Cordilleran Ice Sheet at its peak extended to the edge of the continental shelf, potentially fragmenting even coastal marine species between the western end of the Aleutian Islands and northern Washington (Dyke, 2004)

| Taxonomic and conservation implications
According to Storer (1952), subspecies of Pigeon Guillemots show marked morphological variation along a latitudinal cline.

TA B L E A 1 PCR primers and annealing
temperatures for nuclear loci screened in Pigeon Guillemots.
optimized, and PCR products were sequenced to confirm the presence of the expected dinucleotide repeats.

INTRONS
Amplifications were attempted on four to six test samples each with 30 pairs of PCR primers previously designed to amplify nuclear introns from vertebrates (Friesen et al., 1997(Friesen et al., , 1999, unpublished data) using previously published protocols with the addition of 62.5 μg mL −1 BSA and 0.01 mg mL −1 gelatin.Various annealing temperatures and concentrations of MgCl 2 and DMSO were tested to optimize amplifications.
Two loci for which clean sequence could be derived consistently were then chosen for population screening (Table A1).  1. TA B L E A 5 Type I and Type II error rates and posterior probabilities for each scenario calculated from DIYABC.

TA B L E A 2
as in Ni et al. (2012) gave virtually identical results.Potential Type I statistical errors were addressed by applying Benjamini-Yekutieli (B-Y) corrections to pairwise comparisons in R (Benjamini & Yekutieli, 2001; Narum, 2006; R Core Team, 2018).
grouped into three regions to test alternative roles of Pleistocene refugia: "Aleutians"(Andreanof Is. and Fox Is.); "Central" (Shumigan Is., Semidi Is., Cook Inlet, Prince William Sound, and Southeast Alaska); and "South" (West Vancouver Is., Strait of Georgia, Central Oregon, and Central California).Scenario 1 (Figure2) simulated two refugial populations during the last glaciation, with Aleutian populations diverging from the others pre-glaciation (t2) with no subsequent gene flow, and central and southern populations diverging post-glaciation (t1).Scenario 2 simulated isolation of Pigeon Guillemots in a single southern glacial refugium, followed by step-wise colonization northward post-glaciation.Scenario 3 tested that Pigeon Guillemots were historically isolated in two refugia (Aleutian and southern) at t2 with secondary contact in the central region post-glaciation at t1.The prior distribution for all historical parameters was set to uniform.Due to the absence of published data for genetically effective population size (N e ), broad priors were used (10-100,000).Mean generation time contained the conserved sequence blocks typical of other avian species (F, D, and C Boxes and CSB-1; Marshall & Baker, 1997).All mitochondrial and intron sequences have been deposited in GenBank, accession numbers PP593249-PP593336.Population-specific haplotype and allele frequencies have been deposited in Dryad https:// doi.org/ 10. 5061/ dryad.m37pv md9m.Eighty-five mCR haplotypes, defined by 78 variable sites, were identified.Fifty-nine variable nucleotide sites occurred in Domain I, 17 in Domain II, and two in Domain III.Ewens-Watterson and Chakraborty tests did not detect any deviations from neutrality except in the Strait of Georgia (Mandarte Island): in the Chakraborty test, the Strait of Georgia samples had significantly more haplotypes than expected (14 observed vs. 8.6

F I G U R E 2
Alternative demographic histories tested using DIYABC.All three scenarios assume three population groups: (1) "Aleutian Islands" (Andreanof Is., and Fox Is.); (2) "Central" (Shumigan Is., Semidi Is., Cook Inlet, Prince William Sound, and Southeast Alaska); and (3) "South": West Vancouver Is., Strait of Georgia, Central Oregon, and Central California at the present time (t0), and these populations diverged in the past.The time scale is shown on the right.shared haplotypes were found in geographically adjacent locations; the four that were found in non-adjacent locations occurred at low frequency in one location.The global AMOVA indicated statistically significant geographic structure in sequence variation (Φ st = 0.39, p < .001),and pairwise estimates of Φ st were significant for most comparisons (Table
Genetic distance (Slatkin's linearized F ST ) versus geographic distance (log-transformed linear distance) between sampling locations for Pigeon Guillemots.White circles represent comparisons made using mitochondrial data and black circles represent comparison made using nuclear data.The lines of best fit for both the analysis with mitochondrial data and with nuclear data are shown in red.F I G U R E 6 Statistical parsimony haplotype network derived for control region sequences of Pigeon Guillemots using PopART.Sizes of circles are proportional to haplotype frequencies.Black dots represent inferred intermediate haplotypes that were not found in the present sampling.Haplotype names have been removed for clarity.Region abbreviations as in Table east and south prior to the Wisconsin glaciation, with the latter populations diverging from each other post-glaciation.Conclusions drawn from DIYABC should be treated with caution, since the program can only test models that are supplied by the researcher, and results here are based on only mtDNA variation.Nonetheless, our results suggest that differentiation of Aleutian Island vs. other North American Pigeon Guillemots may be explained at least in part by historical fragmentation, probably by extensive Pleistocene ice fields that would have separated tracts of rocky coastline from each other.
Significance at α = .05is indicated by +.Location abbreviations are in Table

Table 1 )
, and the gene tree was rooted with sequences from one Spectacled Guillemot (C.carbo) and one sample of each of four subspecies of Black Guillemot (C.g. arcticus, C.
g. grylle, C. g. mandti, and C. g. ultimus; GenBank Accession Numbers AF027220, AF027225, AF027231, AF027233, and AF027251; Kidd Number of haplotypes, haplotype diversity (h), nucleotide diversity (π) as a percent (+1 standard deviation), observed/expected F values for Ewens-Watterson test of neutrality, and probabilities of neutrality from Chakraborty's test; one significant value is in bold.Note: 'NA', not applicable due to small sample size.Sampling location abbreviations are in Table1.
TA B L E 2TA B L E 3 Observed/expected heterozygosity estimates for nuclear loci screened for Pigeon Guillemots.Sampling location abbreviations are in Table1.*Significantlydifferent at α = .05both before and after B-Y correction.
Estimates of F ST based on six nuclear loci (above diagonal) and estimates of Φ ST based on sequence variation in the mitochondrial control region (below diagonal) for pairwise comparisons of sampling locations.Location abbreviations are in Table1.

Locus Primer sequence (forward/reverse) Annealing temperature (°C)
Congdon et al. (2000)0) in this species varies geographically, population differences at neutral molecular loci do not fully align with current subspecies delineations: samples of C. c. adianata from the Fox Islands grouped with C. c. kaiurka on the mitochondrial gene trees; furthermore, demographic simulations suggested that guillemots from the Aleutian Islands survived the Pleistocene in a refugium separate from guillemots farther east and south.Thus, the location of the geographic boundary between these two subspecies should be revisited.Outside the Aleutian Islands, variation in neutral loci tended to increase with geographic distance between sampling locations, blurring the lines between C. c. adianta and C. c. eureka.Perez et al. (2018) cautioned against inferring population units from clustering programs such as STRUCTURE under patterns of isolation by distance.Analysis of a larger number of loci may improve resolution of fine population structure and subspecies boundaries.Samples of C. c. columba and C. c. snowi also need to be included in future analyses.Cco5-21R: 5′-AGAGTTGCACAGGTTAAATACC-3′).Six samples were tested for amplification using these primers as well as primers previously developed for microsatellites for Marbled Murrelets (Brachyramphus marmoratus,Friesen et al., 2005), Thick-billed and Common murres (Uria lomvia and Uria aalge, respectively,Ibarguchi et al., 2000)and Yellow Warblers (Dendroica petechia,Dawson et al., 1997).MgCl 2 concentrations and annealing temperatures were a Present study.bDawsonetal.(1997).cIbarguchietal.(2000).dCongdonetal. (2000).
Table of significant linkage disequilibrium for nuclear loci within sampling locations.Estimates of F ST for pairwise comparisons of sampling locations of Pigeon Guillemots based on introns (above diagonal) and microsatellites (below diagonal).Location abbreviations are in Table1.Significantly different from 0 at α = .05before B-Y correction.**Significantly different from 0 at α = .05after B-Y correction.Estimated mean and standard deviation (Stdev) of Ln Pr(K), and delta K calculated for values of K from 1-5 using data from STRUCTURE in STRUCTURE HARVESTER.All analyses run with correlated allele frequencies for 10 replicates with (A) admixture, without location as prior information, and (B) without admixture, with location as prior information."n/a" indicates not applicable.The most likely value is in bold.
*TA B L E A 4